Privacy CURE: Consent Comprehension Made Easy
Although the General Data Protection Regulation (GDPR) defines several potential legal bases for personal data processing, in many cases data controllers, even when they are located outside the European Union (EU), need to obtain consent from EU citizens before processing their personal data. Unfortunately, existing approaches for obtaining consent, such as pages of text followed by an agreement/disagreement mechanism, are neither specific nor informed. In order to address this challenge, we introduce our Consent reqUest useR intErface (CURE) prototype, which is based on the GDPR requirements and the interpretation of those requirements by the Article 29 Working Party (i.e., the predecessor of the European Data Protection Board). The CURE prototype provides transparency regarding personal data processing, gives users more control via customization, and, based on the results of our usability evaluation, improves user comprehension with respect to what data subjects actually consent to. Although the CURE prototype is based on the GDPR requirements, it could potentially be used in other jurisdictions as well.
Compliance Using Metadata
Everybody talks about the data economy. Data is collected, stored, processed and re-used. In the EU, the GDPR creates a framework with conditions (e.g., consent) for the processing of personal data. But there are also other legal provisions containing requirements and conditions for the processing of data. Even today, most of these are hard-coded into workflows or database schemas, if at all. Data lakes are polluted with unusable data because nobody knows about usage rights or data quality. The approach presented here makes the data lake intelligent: it remembers usage limitations and promises made to the data subject or the contractual partner. Data can be used wherever the associated risk can be assessed. Such a system reacts easily to new requirements. If processing is recorded back into the data lake, this information makes it possible to prove compliance, which can be shown to authorities on demand as an audit trail. The concept is best exemplified by the SPECIAL project, https://specialprivacy.eu (Scalable Policy-aware Linked Data Architecture For Privacy, Transparency and Compliance). SPECIAL has several use cases, but the basic framework is applicable beyond those cases.
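The core idea above — records that "remember" their usage limitations and every processing decision being written back as an audit trail — can be sketched in a few lines. The class and function names below are hypothetical illustrations, not the project's actual API:

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A data-lake record that carries its usage limitations as metadata."""
    value: str
    allowed_purposes: frozenset  # promises made to the data subject


@dataclass
class AuditLog:
    """Recording each processing decision yields an audit trail for authorities."""
    entries: list = field(default_factory=list)

    def log(self, record, purpose, allowed):
        self.entries.append((record.value, purpose, allowed))


def process(record, purpose, audit):
    """Use a record only if the requested purpose is covered by its metadata;
    either way, the decision is recorded back into the lake."""
    allowed = purpose in record.allowed_purposes
    audit.log(record, purpose, allowed)
    return record.value if allowed else None
```

For example, a record collected for billing would be released for a billing request but blocked for marketing, with both decisions ending up in the audit log.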
Doc2RDFa: Semantic Annotation for Web Documents
Ever since its conception, the amount of data published on the World Wide Web has been growing rapidly, to the point where it has become an important source of both general and domain-specific information. However, the majority of documents published online are not machine readable by default. Many researchers believe that the answer to this problem is to semantically annotate these documents and thereby contribute to the linked "Web of Data". Yet the process of annotating web documents remains an open challenge. While some efforts towards simplifying this process have been made in recent years, there is still a lack of semantic content creation tools that integrate well with information workers' toolsets. Towards this end, we introduce Doc2RDFa, an HTML rich text processor with the ability to automatically and manually annotate domain-specific content.
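The general idea of automatic semantic annotation can be illustrated with a toy sketch: known domain terms are wrapped in RDFa markup so that a plain HTML fragment starts carrying machine-readable statements. The vocabulary and function below are hypothetical, not Doc2RDFa's actual implementation:

```python
import re

# Hypothetical domain vocabulary: surface term -> schema.org type it denotes.
VOCAB = {"Vienna": "Place", "Doc2RDFa": "SoftwareApplication"}


def annotate(html):
    """Wrap each known term in an RDFa span (toy automatic annotator; it does
    not guard against re-annotating text inside existing markup)."""
    for term, typeof in VOCAB.items():
        markup = (f'<span vocab="https://schema.org/" typeof="{typeof}" '
                  f'property="name">{term}</span>')
        html = re.sub(rf"\b{re.escape(term)}\b", markup, html)
    return html
```

Running it on `<p>Vienna</p>` produces a span asserting that "Vienna" names a `schema:Place`, which RDFa-aware crawlers can extract as a triple.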
I Agree: Customize your Personal Data Processing with the CoRe User Interface
The General Data Protection Regulation (GDPR) requires, except in some predefined scenarios (e.g., contract performance, legal obligations, vital interests), obtaining consent from data subjects for the processing of their personal data. Companies that want to process the personal data of European Union (EU) citizens but are located outside the EU also have to comply with the GDPR. Existing mechanisms for obtaining consent involve presenting the data subject with a document in which all possible data processing, carried out by the entire service, is described in very general terms. Such consent is neither specific nor informed. In order to address this challenge, we introduce a consent request (CoRe) user interface (UI) with maximum control over the data processing and a simplified CoRe UI with reduced control options. Our CoRe UI not only gives users more control over the processing of their personal data but also, according to the usability evaluations reported in the paper, improves their comprehension of consent requests.
Big Data and Analytics in the Age of the GDPR
The new European General Data Protection Regulation places stringent restrictions on the processing of personally identifiable data. The GDPR does not only affect European companies, as the regulation applies to all organizations that track or provide services to European citizens. Free exploratory data analysis is permitted only on anonymous data, at the cost of some legal risks. We argue that for the other kinds of personal data processing, the most flexible and safe legal basis is explicit consent. We illustrate the approach to consent management and compliance with the GDPR being developed by the European H2020 project SPECIAL, and highlight some related big data aspects.
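Consent-based compliance checking of the kind SPECIAL develops can be sketched as a coverage test: a processing request is compliant only if every dimension of the request falls within what the data subject consented to. The simplified data model below (three dimensions, flat sets) is an illustrative assumption, not SPECIAL's actual policy language:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consent:
    """A simplified usage policy: what the data subject agreed to."""
    purposes: frozenset
    data_categories: frozenset
    operations: frozenset


def is_compliant(consent, purpose, category, operation):
    """A request complies if each of its dimensions is covered by the consent."""
    return (purpose in consent.purposes
            and category in consent.data_categories
            and operation in consent.operations)
```

In a real system the dimensions would be vocabulary hierarchies and the coverage test a subsumption check, but the shape of the decision is the same.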
Towards Querying in Decentralized Environments with Privacy-Preserving Aggregation
The Web is a ubiquitous economic, educational, and collaborative space. However, it also serves as a haven for personal information harvesting. Existing decentralised Web-based ecosystems, such as Solid, aim to combat personal data exploitation on the Web by enabling individuals to manage their data in the personal data store of their choice. Since personal data in these decentralised ecosystems are distributed across many sources, there is a need for techniques that support efficient privacy-preserving query execution over personal data stores. Towards this end, in this position paper we present a framework for efficient privacy-preserving federated querying and highlight open research challenges and opportunities. The overarching goal is to provide a means to position future research into privacy-preserving querying within decentralised environments.
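One common building block for privacy-preserving aggregation over distributed stores — offered here only as an illustrative technique, not the paper's specific framework — is local perturbation: each personal data store adds differentially private noise to its partial result before the query engine combines them, so the engine never sees exact per-store values:

```python
import random


def local_noisy_sum(values, epsilon, sensitivity=1.0):
    """Return this store's partial sum plus Laplace noise (the difference of
    two exponentials with rate epsilon/sensitivity is Laplace-distributed)."""
    scale = sensitivity / epsilon
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return sum(values) + noise


def federated_sum(stores, epsilon):
    """Aggregate across stores; only noised partial results leave each store."""
    return sum(local_noisy_sum(values, epsilon) for values in stores)
```

Smaller `epsilon` means stronger privacy but noisier aggregates, which is exactly the efficiency/utility tension the position paper flags as an open challenge.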
Towards Making Distributed RDF processing FLINker
In the last decade, the Resource Description Framework (RDF) has become the de facto standard for publishing semantic data on the Web. This steady adoption has led to a significant increase in the number and volume of available RDF datasets, exceeding the capabilities of traditional RDF stores. This scenario introduces severe big semantic data challenges when it comes to managing and querying RDF data at Web scale. Despite the existence of various off-the-shelf Big Data platforms, processing RDF in a distributed environment remains a significant challenge. In this position paper, based on an in-depth analysis of the state of the art, we propose to manage large RDF datasets in Flink, a well-known scalable distributed Big Data processing framework. Our approach, which we refer to as FLINKer, extends the native graph abstraction of Flink, called Gelly, with RDF graph and SPARQL query processing capabilities.
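FLINKer itself targets Flink's Java/Scala Gelly API, but the core primitive a SPARQL layer must push down — matching a single triple pattern against an RDF edge list, where `None` plays the role of a variable — can be sketched framework-independently:

```python
def match_pattern(triples, s=None, p=None, o=None):
    """Evaluate one SPARQL-style triple pattern over a list of RDF triples:
    a None position is a variable, any other value must match exactly.
    In a Gelly-style setting this becomes a filter over labelled edges."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]
```

A full basic graph pattern is then a join of several such filters, which is precisely where a distributed dataflow engine like Flink earns its keep.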
Characteristic sets profile features: Estimation and application to SPARQL query planning
RDF dataset profiling is the task of extracting a formal representation of a dataset’s features. Such features may cover various aspects of the RDF dataset, ranging from information on licensing and provenance to statistical descriptors of the data distribution and its semantics. In this work, we focus on the characteristic sets profile features, which capture both structural and semantic information of an RDF dataset, making them a valuable resource for different downstream applications. While previous research demonstrated the benefits of characteristic sets in centralized and federated query processing, access to these fine-grained statistics is taken for granted. However, especially in federated query processing, computing this profile feature is challenging, as it can be difficult and/or costly to access and process the entire data from all federation members. We address this shortcoming by introducing the concept of a profile feature estimation and propose a sampling-based approach to generate estimations for the characteristic sets profile feature. In addition, we showcase the applicability of these feature estimations in federated querying by proposing a query planning approach that is specifically designed to leverage these feature estimations. In our first experimental study, we intrinsically evaluate our approach on the representativeness of the feature estimation. The results show that even small samples of just 0.5% of the original graph’s entities allow for estimating both structural and statistical properties of the characteristic sets profile features. Our second experimental study extrinsically evaluates the estimations by investigating their applicability in our query planner using the well-known FedBench benchmark. The results of the experiments show that the estimated profile features allow for obtaining efficient query plans.
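The two notions the abstract builds on can be made concrete: the characteristic set of an entity is the set of predicates it appears with as subject, and a sampling-based estimation samples a fraction of the entities and scales the resulting counts back up. The sketch below is a deliberately naive stand-in for the paper's estimator (uniform entity sampling, simple 1/ratio scaling):

```python
import random
from collections import defaultdict


def characteristic_sets(triples):
    """Map each characteristic set (frozenset of predicates) to the number of
    subjects exhibiting it -- the exact profile feature."""
    preds = defaultdict(set)
    for s, p, _ in triples:
        preds[s].add(p)
    counts = defaultdict(int)
    for pset in preds.values():
        counts[frozenset(pset)] += 1
    return dict(counts)


def estimate_profile(triples, sample_ratio, seed=0):
    """Estimate the profile from a uniform sample of subjects, scaling each
    count by 1/sample_ratio (naive illustration of the sampling idea)."""
    subjects = sorted({s for s, _, _ in triples})
    rng = random.Random(seed)
    k = max(1, int(len(subjects) * sample_ratio))
    sampled_subjects = set(rng.sample(subjects, k))
    sampled = [t for t in triples if t[0] in sampled_subjects]
    return {cs: round(c / sample_ratio)
            for cs, c in characteristic_sets(sampled).items()}
```

A query planner can then use such (estimated) counts as cardinality statistics for triple-pattern groups without ever scanning the full remote datasets.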